478 research outputs found
A storage and access architecture for efficient query processing in spatial database systems
Due to the high complexity of objects and queries and also due to extremely
large data volumes, geographic database systems impose stringent requirements on their
storage and access architecture with respect to efficient query processing. Performance
improving concepts such as spatial storage and access structures, approximations, object
decompositions and multi-phase query processing have been suggested and analyzed as
single building blocks. In this paper, we describe a storage and access architecture which
is composed from the above building blocks in a modular fashion. Additionally, we incorporate
into our architecture a new ingredient, the scene organization, for efficiently
supporting set-oriented access of large-area region queries. An experimental performance
comparison demonstrates that the concept of scene organization leads to considerable
performance improvements for large-area region queries by a factor of up to 150.
Multi-Step Processing of Spatial Joins
Spatial joins are one of the most important operations for combining spatial objects of several relations. In this paper, spatial join processing is studied in detail for extended spatial objects in two-dimensional data space. We present an approach for spatial join processing that is based on three steps. First, a spatial join is performed on the minimum bounding rectangles of the objects, returning a set of candidates. Various approaches for accelerating this step of join processing were examined at last year's conference [BKS 93a]. In this paper, we focus on the problem of how to compute the answers from the set of candidates, which is handled by
the following two steps. First of all, sophisticated approximations
are used to identify answers as well as to filter out false hits from
the set of candidates. For this purpose, we investigate various types
of conservative and progressive approximations. In the last step, the
exact geometry of the remaining candidates has to be tested against
the join predicate. The time required for computing spatial join
predicates can essentially be reduced when objects are adequately
organized in main memory. In our approach, objects are first decomposed
into simple components which are exclusively organized
by a main-memory resident spatial data structure. Overall, we
present a complete approach of spatial join processing on complex
spatial objects. The performance of the individual steps of our approach
is evaluated with data sets from real cartographic applications.
The results show that our approach reduces the total execution
time of the spatial join by factors
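
The first step described in this abstract, a join on minimum bounding rectangles (MBRs) that returns candidate pairs, can be sketched as follows. This is only an illustrative nested-loop version under stated assumptions: objects are given as lists of (x, y) vertices, and the names `MBR`, `mbr_of`, `mbrs_intersect`, and `mbr_join` are hypothetical, not taken from the paper. The paper's own acceleration techniques and its later approximation and exact-geometry steps are not reproduced here.

```python
from typing import NamedTuple


class MBR(NamedTuple):
    """Axis-aligned minimum bounding rectangle (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def mbr_of(vertices):
    """Compute the MBR of an object given as a list of (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return MBR(min(xs), min(ys), max(xs), max(ys))


def mbrs_intersect(a, b):
    """Two rectangles intersect iff they overlap on both axes."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def mbr_join(rel_r, rel_s):
    """Filter step: return index pairs (i, j) whose MBRs intersect.

    The result is a candidate set. It may contain false hits, which
    later steps would have to remove by testing finer approximations
    or the exact geometry.
    """
    boxes_r = [mbr_of(obj) for obj in rel_r]
    boxes_s = [mbr_of(obj) for obj in rel_s]
    return [(i, j)
            for i, a in enumerate(boxes_r)
            for j, b in enumerate(boxes_s)
            if mbrs_intersect(a, b)]
```

For example, a triangle with vertices (0, 0), (2, 0), (0, 2) and a triangle with vertices (1.5, 1.5), (3, 1.5), (3, 3) have intersecting MBRs but disjoint geometry, so the pair is reported as a candidate and would count as a false hit.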
Querying Probabilistic Neighborhoods in Spatial Data Sets Efficiently
In this paper we define the notion
of a probabilistic neighborhood in spatial data: Let a set $P$ of $n$ points in
$\mathbb{R}^d$, a query point $q \in \mathbb{R}^d$, a distance metric $\dist$,
and a monotonically decreasing function $f : \mathbb{R}^+ \rightarrow [0,1]$ be
given. Then a point $p \in P$ belongs to the probabilistic neighborhood $N(q, f)$ of $q$ with respect to $f$ with probability $f(\dist(p,q))$. We envision
applications in facility location, sensor networks, and other scenarios where a
connection between two entities becomes less likely with increasing distance. A
straightforward query algorithm would determine a probabilistic neighborhood in
$\Theta(n \cdot d)$ time by probing each point in $P$.
To answer the query in sublinear time for the planar case, we augment a
quadtree suitably and design a corresponding query algorithm. Our theoretical
analysis shows that -- for certain distributions of planar $P$ -- our algorithm
answers a query in $O((|N(q,f)| + \sqrt{n}) \log n)$ time with high probability
(whp). This matches up to a logarithmic factor the cost induced by
quadtree-based algorithms for deterministic queries and is asymptotically
faster than the straightforward approach whenever $|N(q,f)| \in o(n / \log n)$.
As practical proofs of concept we use two applications, one in the Euclidean
and one in the hyperbolic plane. In particular, our results yield the first
generator for random hyperbolic graphs with arbitrary temperatures in
subquadratic time. Moreover, our experimental data show the usefulness of our
algorithm even if the point distribution is unknown or not uniform: The running
time savings over the pairwise probing approach constitute at least one order
of magnitude already for a modest number of points and queries.
Comment: The final publication is available at Springer via
http://dx.doi.org/10.1007/978-3-319-44543-4_3
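
The straightforward query algorithm mentioned in this abstract, which probes every point and keeps it with probability f(dist(p, q)), might look like the sketch below. It is the baseline only, not the quadtree-based algorithm of the paper. The function name `probabilistic_neighborhood` and its parameters are hypothetical, and Euclidean distance via `math.dist` (Python 3.8+) is an assumed default.

```python
import math
import random


def probabilistic_neighborhood(points, q, f, dist=math.dist, rng=random):
    """Baseline query: probe each point once.

    A point p is included with probability f(dist(p, q)), where f is
    assumed to map distances to values in [0, 1] and to be
    monotonically decreasing. The cost is one distance computation
    per point, so the running time is linear in the number of points.
    """
    return [p for p in points if rng.random() < f(dist(p, q))]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0)]
    # Exponentially decaying inclusion probability (an arbitrary choice).
    sample = probabilistic_neighborhood(
        pts, (0.0, 0.0), lambda d: math.exp(-d), rng=random.Random(0)
    )
    print(sample)
```

With a step function such as `lambda d: 1.0 if d <= 1 else 0.0` the query becomes deterministic and reduces to an ordinary range query around q.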
Efficient Processing of Spatial Joins Using R-Trees
In this paper, we show that spatial joins are very suitable to be processed on a parallel hardware platform. The parallel system is equipped with a so-called shared virtual memory which is well-suited for the design and implementation of parallel spatial join algorithms. We start with an algorithm that consists of three phases: task creation, task assignment and parallel task execution. In order to reduce CPU and I/O cost, the three phases are processed in a fashion that preserves spatial locality. Dynamic load balancing is achieved by splitting tasks into smaller ones and reassigning some of the smaller tasks to idle processors. In an experimental performance comparison, we identify the advantages and disadvantages of several variants of our algorithm. The most efficient one shows an almost optimal speed-up under the assumption that the number of disks is sufficiently large.
Topics: spatial database systems, parallel database systems
Query processing of spatial objects: Complexity versus Redundancy
The management of complex spatial objects in applications, such as geography and cartography,
imposes stringent new requirements on spatial database systems, in particular on efficient
query processing. As shown before, the performance of spatial query processing can be improved
by decomposing complex spatial objects into simple components. Up to now, only decomposition
techniques generating a linear number of very simple components, e.g. triangles or trapezoids, have
been considered. In this paper, we will investigate the natural trade-off between the complexity of
the components and the redundancy, i.e. the number of components, with respect to its effect on
efficient query processing. In particular, we present two new decomposition methods generating
a better balance between the complexity and the number of components than previously known
techniques. We compare these new decomposition methods to the traditional undecomposed representation
as well as to the well-known decomposition into convex polygons with respect to their
performance in spatial query processing. This comparison points out that for a wide range of query
selectivity the new decomposition techniques clearly outperform both the undecomposed representation
and the convex decomposition method. More important than the absolute gain in performance
by a factor of up to an order of magnitude is the robust performance of our new decomposition
techniques over the whole range of query selectivity.
The Real Combination Problem : Panpsychism, Micro-Subjects, and Emergence
Panpsychism harbors an unresolved tension, the seriousness of which has yet to be fully appreciated. I capture this tension as a dilemma, and offer panpsychists advice on how to resolve it. The dilemma, briefly, is as follows. Panpsychists are committed to the perspicuous explanation of macro-mentality in terms of micro-mentality. But panpsychists take the micro-material realm to feature not just mental properties, but also micro-subjects to whom these properties belong. Yet it is impossible to explain the constitution of a macro-subject (like one of us) in terms of the assembly of micro-subjects, for, I show, subjects cannot combine. Therefore the panpsychist explanatory project is derailed by the insistence that the world's ultimate material constituents (ultimates) are subjects of experience. The panpsychist faces a choice of abandoning her explanatory project, or recanting the claim that the ultimates are subjects. This is the dilemma. I argue that the latter option is to be preferred. This needn't constitute a wholesale abandonment of panpsychism, however, since panpsychists can maintain that the ultimates possess phenomenal qualities, despite not being subjects of those qualities. This proposal requires us to make sense of phenomenal qualities existing independently of experiencing subjects, a challenge I tackle in the penultimate section. The position eventually reached is a form of neutral monism, so another way to express the overall argument is to say that, keeping true to their philosophical motivations, panpsychists should really be neutral monists. Peer reviewed. Final accepted version.
BETULA: Numerically Stable CF-Trees for BIRCH Clustering
BIRCH is a widely known approach to clustering that has
influenced much subsequent research and commercial products. The key
contribution of BIRCH is the Clustering Feature tree (CF-Tree), which is a
compressed representation of the input data. As new data arrives, the tree is
eventually rebuilt to increase the compression. Afterward, the leaves of the
tree are used for clustering. Because of the data compression, this method is
very scalable. The idea has been adopted, for example, for k-means, data stream,
and density-based clustering.
Clustering features used by BIRCH are simple summary statistics that can
easily be updated with new data: the number of points, the linear sums, and the
sum of squared values. Unfortunately, how the sum of squares is then used in
BIRCH is prone to catastrophic cancellation.
We introduce a replacement cluster feature that does not have this numeric
problem, that is not much more expensive to maintain, and which makes many
computations simpler and hence more efficient. These cluster features can also
easily be used in other work derived from BIRCH, such as algorithms for
streaming data. In the experiments, we demonstrate the numerical problem and
compare the performance of the original algorithm with that of the improved
cluster features.
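
The catastrophic cancellation mentioned in this abstract can be reproduced with a small sketch. The first function derives the variance from the summary statistics named above (count, linear sum, sum of squares). The second uses a Welford-style running mean and sum of squared deviations, which is a well-known numerically stable alternative. It is shown only as an illustration and is not claimed to be the exact cluster feature proposed in the paper. Function names are hypothetical.

```python
def variance_from_sums(xs):
    """Population variance from (n, linear sum, sum of squares).

    Subtracting two nearly equal large numbers loses most significant
    digits when the mean is large relative to the spread.
    """
    n = len(xs)
    linear_sum = sum(xs)
    sum_squares = sum(x * x for x in xs)
    return sum_squares / n - (linear_sum / n) ** 2


def variance_from_deviations(xs):
    """Population variance via a Welford-style incremental update.

    Maintains the count, the running mean, and the sum of squared
    deviations from the mean, so no large squares are subtracted.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # sum of squared deviations from the current mean
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return m2 / n


if __name__ == "__main__":
    # Small spread around a large offset; the true variance is 22.5.
    data = [1e9 + d for d in (4.0, 7.0, 13.0, 16.0)]
    print(variance_from_sums(data))        # far from 22.5 in double precision
    print(variance_from_deviations(data))  # close to 22.5
```

Both variants can be updated one point at a time, which is the property the abstract highlights for clustering features. Only the quantities being stored differ.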
Personhood, consciousness, and god : how to be a proper pantheist
© Springer Nature B.V. 2018. In this paper I develop a theory of personhood which leaves open the possibility of construing the universe as a person. If successful, it removes one bar to endorsing pantheism. I do this by examining a rising school of thought on personhood, on which persons, or selves, are understood as identical to episodes of consciousness. Through a critique of this experiential approach to personhood, I develop a theory of self as constituted of qualitative mental contents, but where these contents are also capable of unconscious existence. On this theory, though we can be conscious of our selves, consciousness turns out to be inessential to personhood. This move, I then argue, provides resources for responding to the pantheist's problem of God's person. Peer reviewed. Final accepted version.
Improving cluster recovery with feature rescaling factors
The data preprocessing stage is crucial in clustering. Features may describe entities using different scales. To rectify this, one usually applies feature normalisation, which rescales features so that none of them overpowers the others in the objective function of the selected clustering algorithm. In this paper, we argue that the rescaling procedure should not treat all features identically. Instead, it should favour the features that are more meaningful for clustering. With this in mind, we introduce a feature rescaling method that takes into account the within-cluster degree of relevance of each feature. Our comprehensive simulation study, carried out on real and synthetic data, with and without noise features, demonstrates that clustering methods that use the proposed data normalisation strategy clearly outperform those that use traditional data normalisation.